Search for: All records

Creators/Authors contains: "Schaeffer, Hayden"


  1. Free, publicly-accessible full text available March 1, 2026
  2. Symbolic encoding has been used in multioperator learning (MOL) as a way to embed additional information for distinct time-series data. For spatiotemporal systems described by time-dependent partial differential equations (PDEs), the equation itself provides an additional modality to identify the system. The utilization of symbolic expressions alongside time-series samples allows for the development of multimodal predictive neural networks. A key challenge with current approaches is that the symbolic information, i.e., the equations, must be manually preprocessed (simplified, rearranged, etc.) to match and relate to the existing token library, which increases costs and reduces flexibility, especially when dealing with new differential equations. We propose a new token library based on SymPy to encode differential equations as an additional modality for time-series models. The proposed approach incurs minimal cost, is automated, and maintains high prediction accuracy for forecasting tasks. Additionally, we include a Bayesian filtering module that connects the different modalities to refine the learned equation. This improves the accuracy of the learned symbolic representation and the predicted time-series. 
    Free, publicly-accessible full text available January 1, 2026
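As a rough illustration of how SymPy can turn a differential equation into a token sequence without manual rearrangement, the sketch below flattens an expression tree in prefix order. It is not the token library proposed in the abstract above: the function name `equation_tokens`, the choice of Burgers' equation, and the token vocabulary (SymPy class names for operators, printed strings for atoms) are all assumptions made for this example.

```python
import sympy as sp

def equation_tokens(expr):
    """Return a prefix-order token list for a SymPy expression tree.

    Hypothetical helper: operators become their SymPy class name,
    atoms (symbols, numbers) become their printed string.
    """
    tokens = []
    for node in sp.preorder_traversal(expr):
        if node.is_Atom:
            tokens.append(str(node))
        else:
            tokens.append(type(node).__name__)
    return tokens

x, t, nu = sp.symbols("x t nu")
u = sp.Function("u")(x, t)

# Viscous Burgers' equation written as "left-hand side = 0" (illustrative choice).
burgers = sp.Derivative(u, t) + u * sp.Derivative(u, x) - nu * sp.Derivative(u, x, 2)

# The same equation with its terms typed in a different order.
burgers_rearranged = (
    u * sp.Derivative(u, x) - nu * sp.Derivative(u, x, 2) + sp.Derivative(u, t)
)

tokens = equation_tokens(burgers)
tokens_rearranged = equation_tokens(burgers_rearranged)

print(tokens)
print(tokens == tokens_rearranged)
```

Because SymPy stores sums and products in a canonical argument order, both ways of typing the equation produce the same token list, which hints at why a SymPy-based encoding can avoid manual preprocessing.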
  3. We provide high-probability bounds on the condition number of random feature matrices. In particular, we show that if the complexity ratio $N/m$, where $N$ is the number of neurons and $m$ is the number of data samples, scales like $\log^{-1}(N)$ or $\log(m)$, then the random feature matrix is well-conditioned. This result holds without the need for regularization and relies on establishing various concentration bounds between dependent components of the random feature matrix. Additionally, we derive bounds on the restricted isometry constant of the random feature matrix. We also derive an upper bound for the risk associated with regression problems using a random feature matrix. This upper bound exhibits the double descent phenomenon and indicates that this is an effect of the double descent behaviour of the condition number. The risk bounds cover the underparameterized setting, using the least squares problem, and the overparameterized setting, using either the minimum norm interpolation problem or a sparse regression problem. For the noiseless least squares or sparse regression cases, we show that the risk decreases as $m$ and $N$ increase. The risk bound matches the optimal scaling in the literature, and the constants in our results are explicit and independent of the dimension of the data. 
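The dependence of conditioning on the complexity ratio $N/m$ described in the abstract above can be probed numerically. The sketch below is only an experiment under stated assumptions, not the paper's construction: it assumes complex exponential features $\exp(i\langle x_j, w_k\rangle)$ with standard Gaussian data and weights, and the dimension, sample size, and number of trials are arbitrary choices.

```python
import numpy as np

def random_feature_matrix(X, W):
    # Entry (j, k) is exp(i <x_j, w_k>): sample j evaluated at random neuron k.
    # The complex exponential feature map is an assumption for this sketch.
    return np.exp(1j * (X @ W.T))

def median_condition_number(m, N, d=5, trials=5, seed=0):
    """Median 2-norm condition number of an m-by-N random feature matrix."""
    rng = np.random.default_rng(seed)
    conds = []
    for _ in range(trials):
        X = rng.standard_normal((m, d))   # m data samples in dimension d
        W = rng.standard_normal((N, d))   # N random weights ("neurons")
        conds.append(np.linalg.cond(random_feature_matrix(X, W)))
    return float(np.median(conds))

m = 40
# Underparameterized (N < m), interpolation threshold (N = m), overparameterized (N > m).
results = {N: median_condition_number(m, N) for N in (10, 40, 160)}

for N, kappa in results.items():
    print(f"N/m = {N / m:5.2f}  median condition number = {kappa:.2f}")
```

In this kind of experiment the condition number is typically largest near $N = m$ and smaller on either side, which is the shape the abstract associates with double descent.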
  4. Particle dynamics and multi-agent systems provide accurate dynamical models for studying and forecasting the behaviour of complex interacting systems. They often take the form of a high-dimensional system of differential equations parameterized by an interaction kernel that models the underlying attractive or repulsive forces between agents. We consider the problem of constructing a data-based approximation of the interacting forces directly from noisy observations of the paths of the agents in time. The learned interaction kernels are then used to predict the agents’ behaviour over a longer time interval. The approximation developed in this work uses a randomized feature algorithm and a sparse randomized feature approach. Sparsity-promoting regression provides a mechanism for pruning the randomly generated features, which was observed to be beneficial when data are limited and, in particular, led to less overfitting than other approaches. In addition, imposing sparsity reduces the kernel evaluation cost, which significantly lowers the simulation cost for forecasting the multi-agent systems. Our method is applied to various examples, including first-order systems with homogeneous and heterogeneous interactions, second-order homogeneous systems, and a new sheep swarming system. 
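The general idea in the abstract above, fitting an interaction kernel with random features and then pruning them, can be sketched on synthetic data. Everything specific here is an assumption for illustration: the first-order model $\dot{x}_i = \frac{1}{n}\sum_j \phi(|x_j - x_i|)(x_j - x_i)$, the ground-truth kernel $\phi(r) = e^{-r}$, the cosine features, and the use of direct velocity observations. The pruning step is a simple keep-the-largest-coefficients-and-refit rule, used as a stand-in for the sparsity-promoting regression the abstract refers to.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_kernel(r):
    # Hypothetical ground-truth kernel, used only to generate synthetic data.
    return np.exp(-r)

def velocity(X, kernel):
    # First-order model: dx_i/dt = (1/n) * sum_j kernel(|x_j - x_i|) * (x_j - x_i).
    diff = X[None, :, :] - X[:, None, :]        # diff[i, j] = x_j - x_i
    dist = np.linalg.norm(diff, axis=2)         # pairwise distances, shape (n, n)
    return (kernel(dist)[:, :, None] * diff).mean(axis=1)

def design_matrix(snapshots, omega, phase):
    # Column k is the stacked velocity produced when the kernel is the
    # k-th random feature cos(omega_k * r + phase_k).
    cols = []
    for w, b in zip(omega, phase):
        feature = lambda r, w=w, b=b: np.cos(w * r + b)
        cols.append(np.concatenate([velocity(X, feature).ravel() for X in snapshots]))
    return np.stack(cols, axis=1)

n_agents, dim, n_snapshots, n_features, n_keep = 10, 2, 20, 50, 10
snapshots = [rng.standard_normal((n_agents, dim)) for _ in range(n_snapshots)]
v_clean = np.concatenate([velocity(X, true_kernel).ravel() for X in snapshots])
v_noisy = v_clean + 0.01 * rng.standard_normal(v_clean.shape)  # noisy observations

omega = rng.normal(scale=2.0, size=n_features)             # random frequencies
phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)     # random phases
A = design_matrix(snapshots, omega, phase)

# Dense least-squares fit, then keep the n_keep largest coefficients and refit.
c_dense, *_ = np.linalg.lstsq(A, v_noisy, rcond=None)
keep = np.argsort(np.abs(c_dense))[-n_keep:]
c_keep, *_ = np.linalg.lstsq(A[:, keep], v_noisy, rcond=None)
coeffs = np.zeros(n_features)
coeffs[keep] = c_keep

def learned_kernel(r):
    # Evaluate the pruned random feature approximation of the kernel.
    r = np.asarray(r, dtype=float)
    return np.cos(np.multiply.outer(r, omega) + phase) @ coeffs

rel_residual = np.linalg.norm(A @ coeffs - v_noisy) / np.linalg.norm(v_noisy)
print("features kept:", np.count_nonzero(coeffs), "of", n_features)
print("relative residual of pruned fit:", rel_residual)
```

Only the retained features need to be evaluated when `learned_kernel` is called inside a forward simulation, which is where the abstract locates the cost savings from sparsity.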